Math 32 - 28: Beta Distribution

Today: Beta Distribution

Goal: Explore a distribution of proportions

Objectives:

explore the beta distribution
explore the gamma function

Odds

In probability, the statement

\[\text{the odds of observing } c \text{ are }a \text{ to } b\]

is equivalent to the probability

\[P(c) = \displaystyle\frac{a}{a + b}\]
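As a quick illustration of the conversion above, here is a minimal sketch. The function name `odds_to_probability` is a hypothetical helper, not something from the course materials.

```python
from fractions import Fraction

def odds_to_probability(a, b):
    """Convert odds of a to b in favor of an event into the probability a / (a + b)."""
    return Fraction(a, a + b)

# Odds of 3 to 2 correspond to a probability of 3/5.
print(odds_to_probability(3, 2))  # 3/5
```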

Binomial Likelihood

Since each trial has two outcomes, and we observe \(a\) successes among \(a + b\) trials, we can envision a binomial setting \(\text{Bin}(a + b, x)\), where the success probability \(x\) satisfies \(0 \leq x \leq 1\). Rather than fixing \(x\) at a single value, we want some flexibility: we treat the probability as a random quantity \(X\) and let \(N\) count the successes. By Bayes’ Rule, the posterior distribution is

\[P(X = x | N = a) = \displaystyle\frac{P(N = a | X = x) \cdot P(X = x)}{P(N = a)}\]

while the likelihood comes from the binomial distribution

\[P(N = a | X = x) = \binom{a+b}{a}x^{a}(1-x)^{b}\]
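To get a feel for the likelihood as a function of \(x\), the sketch below evaluates it at a few values. The function name `binomial_likelihood` and the sample values of \(x\) are illustrative choices, not part of the notes.

```python
from math import comb

def binomial_likelihood(x, a, b):
    """P(N = a | X = x): a successes and b failures, each trial succeeding with probability x."""
    return comb(a + b, a) * x**a * (1 - x)**b

# With a = 3 successes and b = 2 failures, the likelihood is largest near x = a / (a + b) = 0.6.
for x in (0.2, 0.4, 0.6, 0.8):
    print(x, binomial_likelihood(x, 3, 2))
```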

Beta Distribution

If \(X \sim \text{Beta}(\alpha, \beta)\), then the probability density function (PDF) is

\[f(X = x) = \begin{cases} \displaystyle\frac{1}{B(\alpha, \beta)}x^{\alpha - 1}(1-x)^{\beta - 1}, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}\]

where the normalization constant

\[B(\alpha, \beta) = \displaystyle\int_{0}^{1} \! x^{\alpha - 1}(1-x)^{\beta - 1} \, dx\]

ensures that the total area under the curve is one unit.

Offsets

The offset notation \[\begin{array}{rcl} \alpha & = & a + 1 \\ \beta & = & b + 1 \\ \end{array}\] is there to streamline the statistics seen later (such as the expected value and the variance).
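One way to see where the offsets come from is to assume a uniform prior, \(P(X = x) = 1\) on \(0 < x < 1\) (an assumption made here for illustration; the notes do not fix a prior). Dropping factors that do not depend on \(x\), Bayes’ Rule with the binomial likelihood gives

\[P(X = x | N = a) \propto \binom{a+b}{a}x^{a}(1-x)^{b} \cdot 1 \propto x^{(a + 1) - 1}(1-x)^{(b + 1) - 1}\]

which has the same shape as the beta PDF with \(\alpha = a + 1\) and \(\beta = b + 1\).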

Gamma Function

That normalization constant

\[B(\alpha, \beta) = \displaystyle\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}\]

can be viewed in terms of the gamma function

\[\Gamma(x) = \displaystyle\int_{0}^{\infty} \! t^{x-1}e^{-t} \, dt\]
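As a numerical sanity check, the sketch below compares the integral definition of \(B(\alpha, \beta)\) with the gamma-function formula. The helper names `beta_integral` and `beta_gamma`, the midpoint rule, and the number of subintervals are all illustrative choices.

```python
from math import gamma

def beta_integral(alpha, beta, n=10_000):
    """Midpoint-rule approximation of the integral of x^(alpha-1) (1-x)^(beta-1) over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += x ** (alpha - 1) * (1 - x) ** (beta - 1)
    return h * total

def beta_gamma(alpha, beta):
    """B(alpha, beta) computed from the gamma function."""
    return gamma(alpha) * gamma(beta) / gamma(alpha + beta)

# Both values should be close to 1/60 for alpha = 4, beta = 3.
print(beta_integral(4, 3), beta_gamma(4, 3))
```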

Claim: \(\Gamma(x+1) = x\Gamma(x)\)
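A sketch of why the claim holds for \(x > 0\), using integration by parts:

\[\begin{array}{rcll} \Gamma(x+1) & = & \displaystyle\int_{0}^{\infty} \! t^{x}e^{-t} \, dt & \text{definition of gamma function} \\ ~ & = & \left[-t^{x}e^{-t}\right]_{0}^{\infty} + x\displaystyle\int_{0}^{\infty} \! t^{x-1}e^{-t} \, dt & \text{integration by parts} \\ ~ & = & 0 + x\Gamma(x) & \text{boundary term vanishes for } x > 0 \\ \end{array}\]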

Connection to Factorials

Combining this recurrence with the computation \(\Gamma(1) = 1\), it follows that for natural numbers \(x\) \[\Gamma(x) = (x-1)!\]
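The factorial connection can be checked directly with Python's standard library; this is only a spot check over a few small values, not a proof.

```python
from math import gamma, factorial, isclose

# Compare Gamma(n) with (n - 1)! for the first few natural numbers.
for n in range(1, 8):
    print(n, gamma(n), factorial(n - 1))

matches = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 8))
print(matches)  # True
```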

Generalized Factorial Function

However, the gamma function allows us to input real numbers. For example,

\[\begin{array}{rcll} \Gamma\left(\displaystyle\frac{1}{2}\right) & = & \displaystyle\int_{0}^{\infty} \! t^{-1/2}e^{-t} \, dt & \text{definition of gamma function} \\ ~ & = & \displaystyle\int_{0}^{\infty} \! u^{-1}e^{-u^{2}}(2u) \, du & \text{substitution } t = u^{2} \rightarrow dt = 2u \, du \\ ~ & = & 2\displaystyle\int_{0}^{\infty} \! e^{-u^{2}} \, du & \text{algebra} \\ ~ & = & \displaystyle\int_{-\infty}^{\infty} \! e^{-u^{2}} \, du & \text{even function} \\ ~ & = & \sqrt{\pi} & \text{Gaussian} \\ \end{array}\]
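A short numerical check of this value, along with one use of the recurrence \(\Gamma(x+1) = x\Gamma(x)\) at \(x = 1/2\):

```python
from math import gamma, pi, sqrt, isclose

half = gamma(0.5)
print(half, sqrt(pi))                   # the two values should agree
print(isclose(half, sqrt(pi)))          # True

# Recurrence at x = 1/2: Gamma(3/2) = (1/2) * Gamma(1/2)
print(isclose(gamma(1.5), 0.5 * half))  # True
```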

Beta Distribution

In terms of the gamma function, we now have that the beta distribution PDF is

\[f(X = x) = \begin{cases} \displaystyle\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha - 1}(1-x)^{\beta - 1}, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}\] To understand the probabilistic environment, let us derive the expected value.

Expected Value

\[\begin{array}{rcl} \text{E}[X] & = & \displaystyle\int_{-\infty}^{\infty} \! x \cdot f_{X}(x) \, dx \\ ~ & = & \displaystyle\int_{0}^{1} \! x \cdot \displaystyle\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha - 1}(1-x)^{\beta - 1} \, dx \\ ~ & = & \displaystyle\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\displaystyle\int_{0}^{1} \! x^{(\alpha+1) - 1}(1-x)^{\beta - 1} \, dx \\ ~ & = & \displaystyle\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \cdot \displaystyle\frac{\Gamma(\alpha + 1)\Gamma(\beta)}{\Gamma(\alpha + \beta + 1)}\\ ~ & = & \displaystyle\frac{\Gamma(\alpha + 1)}{\Gamma(\alpha)} \cdot \displaystyle\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta + 1)} \\ ~ & = & \displaystyle\frac{\alpha\Gamma(\alpha)}{\Gamma(\alpha)} \cdot \displaystyle\frac{\Gamma(\alpha + \beta)}{(\alpha + \beta)\Gamma(\alpha + \beta)} \\ ~ & = & \displaystyle\frac{\alpha}{\alpha + \beta} \\ \end{array}\]

Variance

Similarly, the variance for \(X \sim \text{Beta}(\alpha, \beta)\) is \[\sigma^{2} = \displaystyle\frac{\alpha\beta}{(\alpha + \beta + 1)(\alpha + \beta)^{2}}\]
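The formulas for the expected value and the variance can be compared against a simulation. The sketch below uses `random.betavariate` from Python's standard library; the seed, the sample size, and the parameter values \(\alpha = 4\), \(\beta = 3\) are arbitrary choices for illustration.

```python
import random
from statistics import fmean, pvariance

random.seed(32)  # arbitrary seed so the run is repeatable
alpha, beta = 4, 3
draws = [random.betavariate(alpha, beta) for _ in range(100_000)]

sample_mean = fmean(draws)
sample_var = pvariance(draws)

exact_mean = alpha / (alpha + beta)
exact_var = alpha * beta / ((alpha + beta + 1) * (alpha + beta) ** 2)

# Each printed pair should be close, up to simulation noise.
print(sample_mean, exact_mean)
print(sample_var, exact_var)
```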

Example

If we observe 3 heads and 2 tails in 5 flips of a possibly unfair coin, assume a beta distribution and build a range-rule-of-thumb interval \((\mu - 2\sigma, \mu + 2\sigma)\) for the posterior probability of heads.

Parameters

\[a = 3, b = 2 \quad\Rightarrow\quad \alpha = 4, \beta = 3\] \[\mu = \text{E}[X] = \displaystyle\frac{\alpha}{\alpha + \beta} = \displaystyle\frac{4}{7}, \quad \sigma^{2} = \displaystyle\frac{\alpha\beta}{(\alpha + \beta + 1)(\alpha + \beta)^{2}} = \displaystyle\frac{12}{8(7)^{2}} = \displaystyle\frac{3}{98}\] so that \(\sigma = \displaystyle\frac{1}{7}\sqrt{\displaystyle\frac{3}{2}}\), and our range-rule-of-thumb interval is \[\displaystyle\frac{4}{7} \pm \displaystyle\frac{2}{7}\sqrt{\displaystyle\frac{3}{2}}\] or approximately \((0.2215, 0.9214)\).
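The arithmetic in this example can be reproduced with a few lines of Python. The function name `beta_interval` is a hypothetical helper; it applies the offsets \(\alpha = a + 1\), \(\beta = b + 1\) and the formulas for \(\mu\) and \(\sigma^{2}\) given above.

```python
from math import sqrt

def beta_interval(a, b):
    """Range-rule-of-thumb interval (mu - 2 sigma, mu + 2 sigma) for Beta(a + 1, b + 1)."""
    alpha, beta = a + 1, b + 1
    mu = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta + 1) * (alpha + beta) ** 2)
    sigma = sqrt(var)
    return mu - 2 * sigma, mu + 2 * sigma

low, high = beta_interval(3, 2)
print(round(low, 4), round(high, 4))  # 0.2215 0.9214
```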